June 1994 - no changes from October 1993

The QRZ! Ham Radio CDROM
Callsign Database Technical Specification

October 1993

by Fred Lloyd, AA7BQ

Note: the following information is provided for those who wish to access the callsign database information without the use of the QRZ! software. Users of the QRZ! Ham Radio CDROM who wish to write their own callsign database search and retrieval software are encouraged to do so. We welcome user-contributed shareware programs for future versions of the QRZ! Ham Radio CDROM.

BACKGROUND

The support software for the QRZ! callsign database is supplied in three versions, two for the PC and one for the UNIX operating system. None of the versions are supported by Walnut Creek CDROM or by the author. If you are not satisfied with the QRZ! software you may return the CDROM to Walnut Creek for a full refund.

PC VERSIONS

For the PC, there are two versions of the QRZ! software, one for DOS and one for Windows 3.1. Both versions use the same copy of the database and the same general search and retrieval methods. The following discussion applies to both versions of the search software. Any differences will be noted accordingly.

The QRZ callsign database indexing and retrieval method was designed and optimized for CDROM use. The primary goal was fast searching, given that typical CDROM seek times are slow compared to those of a hard disk. The algorithm described below requires only one CDROM disc seek per lookup.

The QRZ callsign database is organized as a set of four sorted copies of the complete list of amateur radio records. One copy is sorted by callsign, one by last name, one by city-state and one by zip code. There is a corresponding index file for each of the sorted databases. The index file contains selected keys from the corresponding database sampled at regular intervals.
The key sampling interval is stored in the index header as bytesperkey, as shown in the structure definition below.

New Index Header Format

The layout of the index header has changed since the last version of the QRZ! Ham Radio CDROM in order to accommodate a machine independent format. All index header fields are now stored as null terminated ASCII strings. Both the old and the new index header formats are shown below:

    /*
    **  Old Index Header Block Definition
    **
    **  This block is located at the start of each index
    */
    typedef struct {
        char  dataname[13];   /* Name of the data file              */
        ulong bytesperkey;    /* Data Bytes per Index Item          */
        uint  numkeys;        /* Number of items in this index      */
        uint  keylen;         /* Length of each key item in bytes   */
    } old_index_header;

The old index header format is obsolete and will not be used in the current or future versions of the database.

    /*
    **  New Index Header Block Definition
    **
    **  This block is located at the start of each index
    */
    typedef struct {
        char dataname[16];    /* Name of the data file              */
        char bytesperkey[8];  /* Data Bytes per Index Item          */
        char numkeys[8];      /* Number of items in this index      */
        char keylen[8];       /* Length of each key item in bytes   */
        char version[8];      /* Database Version ID                */
    } index_header;

The old index type may be quickly distinguished by looking for zero bytes (binary 0, not the ASCII character '0') at offsets 16 and 17. For example, suppose you have read the header structure into a character array called 'hdr'. The following will identify the old header style on a machine with 16 bit integers (i.e. a PC):

    if (hdr[16] == 0 && hdr[17] == 0) {
        /* header is old style */
    } else {
        /* header is new style */
    }

This works because 'bytesperkey' in the new style is never zero and the value is always left justified in the structure member, so the byte at offset 16 of a new style header always holds a nonzero ASCII digit.

Index Usage

The name index is set to a maximum of 16 characters with longer names being truncated. Names are stored in last-first format with a space between the names.
The city/state index uses 12 characters per entry, the callsign index 6 characters and the zip code index 5 characters.

The data which follows the header is simply a long list of single field records. The records are tightly packed on 'keylen' boundaries without separators. There is no guarantee of a null terminator on any index record entry.

When the program qrz.exe is run it first searches for a drive containing the base directory \CALLBK. Next, it loads all four index files (callbkc.idx, callbkn.idx, callbks.idx and callbkz.idx) into tables in memory. These tables were kept small so as not to place an undue RAM requirement on the user's system.

Next, when a user specifies a field and key to search, the program searches the relevant index table and returns the closest entry that is less than or equal to the supplied key. The table position of this key is then multiplied by the 'bytesperkey' value to arrive at a database file offset. This offset is then used to perform the first and only seek into the database. Once positioned within the file, the program performs a sequential search to return the match. The search terminates at the next index key value if the field is not found.

The database files all have the same format. Each record consists of comma separated fields and ends with a single newline '\n' (ASCII 0xa) character. Blank fields are simply stored as a comma. Every record has the same number of commas in it. Actual commas in a data field are stored as a semi-colon ';' which should be replaced by a comma in the user's output formatting routine.
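
As an illustration of the index lookup described above, the following C sketch converts a search key into a database file offset. It is not taken from the QRZ! sources. The function name find_offset is hypothetical, and two things are assumed rather than specified: that short keys are padded on the right with spaces to 'keylen' characters, and that keys compare in plain byte order. The caller is expected to have read the new style header and the packed key table into memory already.

```c
#include <stdlib.h>
#include <string.h>

/* New style index header, as defined in this specification. */
typedef struct {
    char dataname[16];    /* Name of the data file            */
    char bytesperkey[8];  /* Data Bytes per Index Item        */
    char numkeys[8];      /* Number of items in this index    */
    char keylen[8];       /* Length of each key item in bytes */
    char version[8];      /* Database Version ID              */
} index_header;

/*
 * find_offset() is a hypothetical helper, not part of the QRZ! software.
 *
 * hdr  - new style header read from the start of an index file
 * keys - the packed key table that follows the header
 *        (numkeys entries of keylen bytes each, no terminators)
 * want - null terminated key supplied by the user
 *
 * Returns the data file offset at which to begin the sequential scan,
 * or -1 if the header values are unusable.
 */
long find_offset(const index_header *hdr, const char *keys, const char *want)
{
    long   bytesperkey = atol(hdr->bytesperkey);
    long   numkeys     = atol(hdr->numkeys);
    size_t keylen      = (size_t)atol(hdr->keylen);
    size_t n           = strlen(want);
    char   padded[32];
    long   lo = 0, hi = numkeys - 1, best = 0;

    if (keylen == 0 || keylen > sizeof padded || numkeys <= 0)
        return -1L;

    /* ASSUMPTION: index keys are space padded to keylen characters. */
    if (n > keylen)
        n = keylen;
    memset(padded, ' ', keylen);
    memcpy(padded, want, n);

    /* Binary search for the last table entry <= the wanted key. */
    while (lo <= hi) {
        long mid = lo + (hi - lo) / 2;
        if (memcmp(keys + (size_t)mid * keylen, padded, keylen) <= 0) {
            best = mid;
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }

    /* Table position times bytesperkey gives the seek offset. */
    return best * bytesperkey;
}
```

The returned value is where the single seek lands. This specification does not say whether that position always falls on a record boundary, so a reader may need to discard a partial first line before starting the sequential scan.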

    /*
    **  Standard Record Format
    */
    char callsign[6];       /* Call Sign Decoded         */
    char lastname[32];      /* Last Name                 */
    char namesuffix[2];     /* Name Suffix               */
    char frstname[32];      /* First Name                */
    char middleinit[1];     /* Middle Initial            */
    char streetaddr[32];    /* Street Address            */
    char city[32];          /* City                      */
    char state[2];          /* State Code                */
    char datelicensed[5];   /* Date Licensed (yyddd)     */
    char dateborn[5];       /* Date Born (yyddd)         */
    char license_class[1];  /* License Class             */
    char zipcode[5];        /* Zip Code                  */
    char prevcall[6];       /* Previous Call             */
    char prevclass[1];      /* Previous Class            */

The callsign fields are arranged in a strict "ccdccc" columnar format where 'c' represents a letter and 'd' a digit. Callsigns which do not conform to the "ccdccc" format are space filled in the relevant positions. This field is rearranged to the proper layout by the user program's output formatting routines.

All dates are stored in 5 character Julian format (two digit year followed by three digit day of year), e.g. 93003 equals January 3, 1993. Dates before 1900 or after year 2000 must be determined by their context usage (good luck).

Some will notice that the database no longer contains station location information. This information is no longer supplied by nor available from the FCC since it is no longer a part of their record keeping (see the May 1993 QST for more info).

Cross Reference Information

Callsigns in the database are now cross-referenced to both the current and the previous call sign for each entry in which they are available. A cross reference record takes the form of 'old,new' with no other information in the record.

A record can be identified as a cross reference in either of two ways. First, if the record length is less than 15 characters, then it is a cross reference record. Secondly, if the record contains only one comma ",", then it is a cross reference record. It is not necessary to test for both cases; either will do.
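
The record rules above (comma separated fields, ';' standing in for a literal comma, the 5 character date, and the one-comma cross reference test) could be handled along the lines of the following C sketch. It is illustrative only and is not taken from the QRZ! sources. The function names is_cross_reference, split_record and decode_date are hypothetical, and decode_date simply assumes every year is 19yy, which this specification notes is not always true.

```c
/* Returns 1 if the record holds exactly one comma (an 'old,new' pair). */
int is_cross_reference(const char *rec)
{
    int commas = 0;

    for (; *rec != '\0' && *rec != '\n'; rec++)
        if (*rec == ',')
            commas++;
    return commas == 1;
}

/*
 * Splits one record in place. Each comma becomes a terminator, the
 * trailing newline is removed, and every ';' is turned back into ','.
 * Stores up to 'max' field pointers and returns how many were stored.
 */
int split_record(char *rec, char *fields[], int max)
{
    int   n = 0;
    char *p = rec;

    if (max <= 0)
        return 0;
    fields[n++] = p;
    for (; *p != '\0'; p++) {
        if (*p == '\n') {
            *p = '\0';
            break;
        }
        if (*p == ',') {
            *p = '\0';
            if (n < max)
                fields[n++] = p + 1;
        } else if (*p == ';') {
            *p = ',';
        }
    }
    return n;
}

/*
 * Decodes a 5 character yyddd date. Returns 1 on success, 0 if the
 * text is not five digits or the day number is out of range.
 * ASSUMPTION: the century is always 19xx.
 */
int decode_date(const char *yyddd, int *year, int *month, int *day)
{
    static const int mdays[12] =
        { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    int i, ddd, leap, m;

    for (i = 0; i < 5; i++)
        if (yyddd[i] < '0' || yyddd[i] > '9')
            return 0;

    *year = 1900 + (yyddd[0] - '0') * 10 + (yyddd[1] - '0');
    ddd = (yyddd[2] - '0') * 100 + (yyddd[3] - '0') * 10 + (yyddd[4] - '0');
    leap = (*year % 4 == 0 && (*year % 100 != 0 || *year % 400 == 0));
    if (ddd < 1 || ddd > 365 + leap)
        return 0;

    /* Walk the months, subtracting each month's length in turn. */
    for (m = 0; m < 12; m++) {
        int len = mdays[m] + (m == 1 && leap);
        if (ddd <= len)
            break;
        ddd -= len;
    }
    *month = m + 1;
    *day = ddd;
    return 1;
}
```

A caller would typically test is_cross_reference() on the raw line first, since split_record() modifies its buffer, and only then split and format the fields.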
When a cross reference record is encountered, you must fetch the second field (the new callsign) and restart the search to return the primary reference.

UNIX VERSION

Another copy of the database exists on the CD in the /unix directory. Full UNIX source code is included in this directory, as is a readme file explaining its use.

Thank you for buying the QRZ! Ham Radio CDROM.

New programs and other submissions to future QRZ! Ham Radio CDROM editions should be sent to Walnut Creek CDROM, preferably by anonymous ftp to ftp.cdrom.com.